22 research outputs found

    Buffered Qualitative Stability explains the robustness and evolvability of transcriptional networks

    The gene regulatory network (GRN) is the central decision‐making module of the cell. We have developed a theory called Buffered Qualitative Stability (BQS) based on the hypothesis that GRNs are organised so that they remain robust in the face of unpredictable environmental and evolutionary changes. BQS makes strong and diverse predictions about the network features that allow stable responses under arbitrary perturbations, including the random addition of new connections. We show that the GRNs of E. coli, M. tuberculosis, P. aeruginosa, yeast, mouse, and human all verify the predictions of BQS. BQS explains many of the small- and large‐scale properties of GRNs, provides conditions for evolvable robustness, and highlights general features of transcriptional response. BQS is severely compromised in a human cancer cell line, suggesting that loss of BQS might underlie the phenotypic plasticity of cancer cells, and highlighting a possible sequence of GRN alterations concomitant with cancer initiation. DOI: http://dx.doi.org/10.7554/eLife.02863.00

    Universal attenuators and their interactions with feedback loops in gene regulatory networks

    Using a combination of mathematical modelling, statistical simulation and large-scale data analysis we study the properties of linear regulatory chains (LRCs) within gene regulatory networks (GRNs). Our modelling indicates that downstream genes embedded within LRCs are highly insulated from the variation in expression of upstream genes, and thus LRCs act as attenuators. This observation implies a progressively weaker functionality of LRCs as their length increases. When analyzing the preponderance of LRCs in the GRNs of Escherichia coli K12 and several other organisms, we find that very long LRCs are essentially absent. In both E. coli and M. tuberculosis we find that four-gene LRCs are intimately linked to identical feedback loops that are involved in potentially chaotic stress response, indicating that the dynamics of these potentially destabilising motifs are strongly restrained under homeostatic conditions. The same relationship is observed in a human cancer cell line (K562), and we postulate that four-gene LRCs act as 'universal attenuators'. These findings suggest a role for long LRCs in dampening variation in gene expression, thereby protecting cell identity, and in controlling dramatic shifts in cell-wide gene expression through inhibiting chaos-generating motifs.

    Identification of 2R-ohnologue gene families displaying the same mutation-load skew in multiple cancers

    The complexity of signalling pathways was boosted at the origin of the vertebrates, when two rounds of whole genome duplication (2R-WGD) occurred. Those genes and proteins that have survived from the 2R-WGD, termed 2R-ohnologues, belong to families of two to four members, and are enriched in signalling components relevant to cancer. Here, we find that while only approximately 30% of human transcript-coding genes are 2R-ohnologues, they carry 42–60% of the gene mutations in 30 different cancer types. Across a subset of cancer datasets, including melanoma, breast, lung adenocarcinoma, liver and medulloblastoma, we identified 673 2R-ohnologue families in which one gene carries mutations at multiple positions, while sister genes in the same family are relatively mutation free. Strikingly, in 315 of the 322 2R-ohnologue families displaying such a skew in multiple cancers, the same gene carries the heaviest mutation load in each cancer, and usually the second-ranked gene is also the same in each cancer. Our findings inspire the hypothesis that in certain cancers, heterogeneous combinations of genetic changes impair parts of the 2R-WGD signalling networks and force information flow through a limited set of oncogenic pathways in which specific non-mutated 2R-ohnologues serve as effectors. The non-mutated 2R-ohnologues are therefore potential therapeutic targets. These include proteins linked to growth factor signalling, neurotransmission and ion channels.

    Robust And Scalable Learning Of Complex Dataset Topologies Via Elpigraph

    Large datasets represented by multidimensional data point clouds often possess non-trivial distributions with branching trajectories and excluded regions, with the recent single-cell transcriptomic studies of the developing embryo being notable examples. Reducing the complexity and producing compact and interpretable representations of such data remains a challenging task. Most of the existing computational methods are based on exploring the local data point neighbourhood relations, a step that can perform poorly in the case of multidimensional and noisy data. Here we present ElPiGraph, a scalable and robust method for approximation of datasets with complex structures which does not require computing the complete data distance matrix or the data point neighbourhood graph. This method is able to withstand high levels of noise and is capable of approximating complex topologies via principal graph ensembles that can be combined into a consensus principal graph. ElPiGraph deals efficiently with large and complex datasets in various fields from biology, where it can be used to infer gene dynamics from single-cell RNA-Seq, to astronomy, where it can be used to explore complex structures in the distribution of galaxies. Comment: 32 pages, 14 figures.

    Inevitability and containment of replication errors for eukaryotic genome lengths spanning Megabase to Gigabase

    The replication of DNA is initiated at particular sites on the genome called replication origins (ROs). Understanding the constraints that regulate the distribution of ROs across different organisms is fundamental for quantifying the degree of replication errors and their downstream consequences. Using a simple probabilistic model, we generate a set of predictions on the extreme sensitivity of error rates to the distribution of ROs, and how this distribution must therefore be tuned for genomes of vastly different sizes. As genome size changes from megabases to gigabases, we predict that regularity of RO spacing is lost, that large gaps between ROs dominate error rates but are heavily constrained by the mean stalling distance of replication forks, and that, for genomes spanning ∼100 megabases to ∼10 gigabases, errors become increasingly inevitable but their number remains very small (three or less). Our theory predicts that the number of errors becomes significantly higher for genome sizes greater than ∼10 gigabases. We test these predictions against datasets in yeast, Arabidopsis, Drosophila, and human, and also through direct experimentation on two different human cell lines. Agreement of theoretical predictions with experiment and datasets is found in all cases, resulting in a picture of great simplicity, whereby the density and positioning of ROs explain the replication error rates for the entire range of eukaryotes for which data are available. The theory highlights three domains of error rates: negligible (yeast), tolerable (metazoan), and high (some plants), with the human genome at the extreme end of the middle domain.

    Thymic involution and rising disease incidence with age

    For many cancer types, incidence rises rapidly with age as an apparent power law, supporting the idea that cancer is caused by a gradual accumulation of genetic mutations. Similarly, the incidence of many infectious diseases strongly increases with age. Here, combining data from immunology and epidemiology, we show that many of these dramatic age-related increases in incidence can be modeled based on immune system decline, rather than mutation accumulation. In humans, the thymus atrophies from infancy, resulting in an exponential decline in T cell production with a half-life of ∼16 years, which we use as the basis for a minimal mathematical model of disease incidence. Our model outperforms the power law model with the same number of fitting parameters in describing cancer incidence data across a wide spectrum of different cancers, and provides excellent fits to infectious disease data. This framework provides mechanistic insight into cancer emergence, suggesting that age-related decline in T cell output is a major risk factor.
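
    As a rough illustration of the exponential decline mentioned in this abstract, the sketch below computes the fraction of initial T cell production remaining at a given age, assuming a pure exponential decay with the quoted half-life of about 16 years. The function name and the constant are hypothetical, chosen here for illustration; this is not the authors' incidence model, which the abstract does not spell out.

```python
# Minimal sketch: exponential decay of T cell production with age.
# ASSUMPTION: a pure exponential with a 16-year half-life (the approximate
# value quoted in the abstract); names below are illustrative only.

HALF_LIFE_YEARS = 16.0


def relative_t_cell_output(age_years, half_life=HALF_LIFE_YEARS):
    """Return the fraction of initial T cell production remaining at age_years."""
    if age_years < 0:
        raise ValueError("age_years must be non-negative")
    # Output halves once per half-life: f(t) = (1/2) ** (t / half_life)
    return 0.5 ** (age_years / half_life)


if __name__ == "__main__":
    for age in (0, 16, 32, 64):
        print(age, relative_t_cell_output(age))
```

    Under this assumption, production halves every 16 years, so it falls to one quarter of its initial value by age 32.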

    Unreplicated DNA remaining from unperturbed S phases passes through mitosis for resolution in daughter cells

    To prevent rereplication of genomic segments, the eukaryotic cell cycle is divided into two nonoverlapping phases. During late mitosis and G1, replication origins are “licensed” by loading MCM2-7 double hexamers, and during S phase licensed replication origins activate to initiate bidirectional replication forks. Replication forks can stall irreversibly, and if two converging forks stall with no intervening licensed origin (a “double fork stall”, or DFS), replication cannot be completed by conventional means. We previously showed how the distribution of replication origins in yeasts promotes complete genome replication even in the presence of irreversible fork stalling. This analysis predicts that DFSs are rare in yeasts but highly likely in large mammalian genomes. Here we show that complementary strand synthesis in early mitosis, ultrafine anaphase bridges, and G1-specific p53-binding protein 1 (53BP1) nuclear bodies provide a mechanism for resolving unreplicated DNA at DFSs in human cells. When origin number was experimentally altered, the number of these structures closely agreed with theoretical predictions of DFSs. 53BP1 is preferentially bound to larger replicons, where the probability of DFSs is higher. Loss of 53BP1 caused hypersensitivity to licensing inhibition when replication origins were removed. These results provide a striking convergence of experimental and theoretical evidence that unreplicated DNA can pass through mitosis for resolution in the following cell cycle.